Data compression for screen data#81
Data compression for screen data#81schewskone wants to merge 34 commits intosensorium-competition:mainfrom
Conversation
…ges and mp4 videos
… for compatibility with allen-exporter and future datasets to avoid duplicate files
Adding small fix for ToTensor issue to decoder PR because seperate PR is unnessesary
There was a problem hiding this comment.
Pull Request Overview
This PR adds support for compressed screen data by extending trials to handle encoded images and videos, updating tests and data generation to work with JPEG/MP4, and adjusting interpolation interfaces and CI.
- Introduce EncodedImageTrial and EncodedVideoTrial with compressed data loaders
- Enhance test utilities and screen interpolation tests for both encoded and raw formats
- Update interpolation API to return only data, adjust downstream consumers, and add FFmpeg to CI
Reviewed Changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_sequence_interpolator.py | Update tests for new single-array return value of interpolate |
| tests/test_screen_interpolator.py | Add parameterized tests for encoded vs. raw screen data |
| tests/create_screen_data.py | Extend data generation to save JPEG/MP4 and emit metadata |
| experanto/interpolators.py | Remove valid returns, introduce image_names flag, add encoded trials |
| experanto/experiment.py | Adjust interpolate to drop valid output |
| experanto/datasets.py | Update dataset pipelines to match new return signature |
| configs/default.yaml | Add image_names configuration |
| .github/workflows/test.yml | Install FFmpeg for encoding dependencies |
Comments suppressed due to low confidence (4)
experanto/interpolators.py:152
- The return type annotation still indicates a tuple, but the method now returns only a single array. Update the type hint and docstring to reflect the new signature.
def interpolate(self, times: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
experanto/interpolators.py:425
- [nitpick] Using
formatshadows the built-in Python function. Consider renaming this variable tofile_formatto avoid confusion.
format = metadata.get("file_format")
tests/create_screen_data.py:14
- The new
encodedparameter isn't documented. Please update the function docstring to explain whatencodeddoes and what formats it controls.
def create_screen_data(
tests/create_screen_data.py:118
- The code calls
shutil.rmtreebutshutilis not imported. Addimport shutilat the top of the module.
shutil.rmtree(SCREEN_ROOT)
3c5bb5a to
b3b0079
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 10 changed files in this pull request and generated 16 comments.
Comments suppressed due to low confidence (1)
experanto/interpolators.py:826
- This call to ScreenTrial.get_data_ in an initialization method is overridden by EncodedImageTrial.get_data_.
This call to ScreenTrial.get_data_ in an initialization method is overridden by EncodedVideoTrial.get_data_.
This call to ScreenTrial.get_data_ in an initialization method is overridden by BlankTrial.get_data_.
This call to ScreenTrial.get_data_ in an initialization method is overridden by InvalidTrial.get_data_.
self._cached_data = self.get_data_()
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def __init__( | ||
| self, data_file_name, meta_data, shared_decoder=None, cache_data: bool = False | ||
| ) -> None: | ||
| super().__init__( | ||
| data_file_name, | ||
| meta_data, | ||
| tuple(meta_data.get("image_size")), | ||
| meta_data.get("first_frame_idx"), | ||
| meta_data.get("num_frames"), | ||
| cache_data=cache_data, | ||
| ) | ||
| self.video_decoder = shared_decoder | ||
| if self.video_decoder is None: | ||
| raise ValueError( | ||
| "EncodedVideoTrial requires a shared_decoder to be provided." | ||
| ) |
There was a problem hiding this comment.
There's a critical initialization order bug. The parent class ScreenTrial.init (called via super()) will attempt to cache data by calling get_data_() at line 826 if cache_data=True. However, self.video_decoder is only set after super().init() completes (line 923). This means get_data_() will be called before video_decoder is initialized, causing the check at lines 924-927 to raise a ValueError. The video_decoder should be set before calling super().init(), or caching should be disabled for EncodedVideoTrial since it requires frame_indices.
There was a problem hiding this comment.
@schewskone - just to be sure - have you checked this does not actually happen?
(in default configs _cache_data is set to False, so we might have missed it)
|
|
||
| frames = np.random.rand(n_frames, *frame_shape).astype(np.float32) | ||
| # Generate frames with values in [0, 255] for better encoding | ||
| frames = (np.random.rand(n_frames, *frame_shape) * 255).astype(np.uint8) |
There was a problem hiding this comment.
@schewskone why is it now np.uint8 and not float32?
There was a problem hiding this comment.
I get not all of the asserts fro valid should be removed here.
@schewskone could you please take a look and plug some of them back, f.e. line 37 and 41-43 probably should stay, same for 71 and 75-77
(I merged it with the current main - check if you disagree with stn)
| file_ext = ".jpg" | ||
| else: | ||
| # Save as numpy array (original behavior) | ||
| np.save(data_dir / f"{i:05d}.npy", frames[i].astype(np.uint8)) |
There was a problem hiding this comment.
@schewskone it used to be np.float32 here in test as well - why have you changed it and does it match the real exported data?
Also, why there are no grayscale .jpeg saved? (e.g. 1 channel ones)
| # Initialize the decoder only once per unique file | ||
| # Assuming ScreenTrial.create or a helper can return just a decoder | ||
| if self.device == "cuda": | ||
| with set_cuda_backend("beta"): |
There was a problem hiding this comment.
@schewskone could you please add a comment here what set_cuda_backend("beta") does? and is it necessary to have it here because moving decoded data from CPU to GPU after interpolating is much slower? have you profiled it?
|
|
||
| # 2. Logic to handle shared video decoders | ||
| decoder_to_use = None | ||
| if file_format in [".mp4", ".avi", ".mov"]: # Add your video formats here |
There was a problem hiding this comment.
@schewskone should we check that a file_format is .npy otherwise and throw a NonImplemented error is not?
| if data_file_name not in shared_decoders: | ||
| # Initialize the decoder only once per unique file | ||
| # Assuming ScreenTrial.create or a helper can return just a decoder | ||
| if self.device == "cuda": |
There was a problem hiding this comment.
what is device cuda:0? I would suggest checking if device contains cuda
| def _initialize_decoder(self, data_file_name): | ||
| decoder = VideoDecoder( | ||
| str(data_file_name), |
There was a problem hiding this comment.
@schewskone why do you need to cast it to str? could it be sth else?
| def _initialize_decoder(self, data_file_name): | |
| decoder = VideoDecoder( | |
| str(data_file_name), | |
| def _initialize_decoder(self, data_file_name: str): | |
| decoder = VideoDecoder( | |
| data_file_name, |
| ) -> Union[tuple[dict, dict], dict, tuple[np.ndarray, np.ndarray], np.ndarray]: | ||
| if device is None: | ||
| values = {} | ||
| valid = {} |
There was a problem hiding this comment.
This needs to be returned back!
| ) | ||
|
|
||
|
|
||
| class EncodedImageTrial(ScreenTrial): |
There was a problem hiding this comment.
@schewskone where is it actually used? I cannot find it in being called explicitly but I might be missing sth (the only place I see it could be called now is lines 782-783)
| cls = globals()[class_name] | ||
|
|
||
| # Pass shared_decoder only for EncodedVideoTrials | ||
| if cls is EncodedVideoTrial: |
There was a problem hiding this comment.
@schewskone could we please add a small comment here why EncodedImageTrial don't need a shared decoder?
|
|
||
| def get_data_(self) -> np.array: | ||
| """Override base implementation to load compressed images""" | ||
| img = cv2.imread(str(self.data_file_name)) # returns BGR |
There was a problem hiding this comment.
I guess here even if the original .jpg is 1 channel (grayscale) image - cv2.imread would upload 3 duplicated channels. Do we want to catch it somehow, @schewskone (Warnings? or set a flag to cast things to 1 channel if the user knows their jpg are grayscale?)?
| if img is None: | ||
| raise ValueError(f"Could not read image file: {self.data_file_name}") | ||
| # Convert BGR to RGB | ||
| img = img[:, :, [2, 1, 0]] |
There was a problem hiding this comment.
@schewskone why not cv2.cvtColor(img, cv2.COLOR_BGR2RGB)? I'm not 100% sure but quick googling says that cv2 should be faster (both create a contagious copy of the image though)
|
|
||
| def get_data_(self, frame_indices=None) -> np.array: | ||
| """Override base implementation to load compressed videos""" | ||
| # Frame_indices is only ever None for caching purposes, we then decode entire video and cache it |
There was a problem hiding this comment.
should we maybe add an assert then, that if self._cached_data is None then frame_indices should not be None?
| frames = frames[:, [2, 1, 0], ...] | ||
| # Reorder dimensions | ||
| frames = frames.permute(0, 2, 3, 1).contiguous() # T,H,W,C |
There was a problem hiding this comment.
why not to do it together? because it looks like some dimensions end up being moved twice
| "EncodedVideoTrial requires a shared_decoder to be provided." | ||
| ) | ||
|
|
||
| def get_data(self, frame_indices) -> np.array: |
There was a problem hiding this comment.
| def get_data(self, frame_indices) -> np.array: | |
| def get_data(self, frame_indices) -> np.ndarray: |
| out = out.transpose( | ||
| 0, 3, 1, 2 | ||
| ) # transform into (T, C, H, W) after finishing with Cv2 operation |
There was a problem hiding this comment.
@schewskone should this reordering really be here? because it has not been here before. is it back compatible with the numpy files?
pollytur
left a comment
There was a problem hiding this comment.
please look at the comments - at least test need to plug back valid checks for sure
Implemented support for compressed screen formats
get_data_()methods that support encoded formats.Right now the VideoDecoder decodes the entire video and returns that to the interpolation function. This can be optimized with the help of sliced decoding which would require changes to the interpolation method. @pollytur and I decided we should discuss first and implement it later on if deemed necessary.